1 Quick introduction to TissueMiner and the R-language

# This is a comment: the code below will print "Welcome to TissueMiner"
print("Welcome to TissueMiner")
## [1] "Welcome to TissueMiner"

1.1 TissueMiner automated workflow

  • Instructions to run the automated workflow can be found here
  • Here is a list of important files that are generated by the automated workflow
    • The database: (<movie_name>.sqlite)
    • An extra table (cellshapes.RData) to represent cell contours as cell vertices ordered anticlockwise
    • An extra table (./roi_bt/lgRoiSmoothed.RData) to store cells in user-defined regions of interest
    • An extra table (./topochanges/t1DataFilt.RData) to store cell neighbor changes
    • An extra table (./shear_contrib/triangles.RData) to store triangles
    • Extra tables (./shear_contrib/<ROI_name>/avgDeformTensorsWide.RData) to store the calculated pure shear deformation of triangles and tissue for each region of interest

1.2 Install TissueMiner routines and RStudio

1.2.1 Local TissueMiner installation

1.2.1.1 Method 1

## Download TissueMiner (in your home folder):
# go to https://github.com/mpicbg-scicomp/tissue_miner
# click on 'Clone in Desktop' or 'Download ZIP'

#### Open a terminal and execute the lines below ####

## Set a path to the tissue_miner folder and export it to the global environment
echo "export TM_HOME=~/tissue_miner" >> ~/.bash_profile
source ~/.bash_profile

1.2.1.2 Method 2

#### Open a terminal and execute the lines below ####

## Set a path to install TissueMiner in your home folder (please use an absolute path)
echo "export TM_HOME=~/tissue_miner" >> ~/.bash_profile
source ~/.bash_profile

## If Git isn't installed yet, please install it
# On Ubuntu
sudo apt-get install git

# On macOS, download the installer from
# http://git-scm.com/download/mac

## Download TissueMiner (requires Git to be installed)
git clone https://github.com/mpicbg-scicomp/tissue_miner.git ${TM_HOME}

1.2.2 RStudio installation

Please follow the instructions here to install RStudio desktop.

1.2.3 Avconv installation

## install avconv (only needed for movie rendering)
# On Ubuntu
sudo apt-get install --assume-yes libav-tools
# On Mac, please visit
# http://ffmpegmac.net/
# or
# http://superuser.com/questions/568464/how-to-install-libav-avconv-on-osx

1.3 Load TissueMiner routines in RStudio

  • Please modify the paths (first chunk below) according to your data location
  • Always execute the code below before running the analysis
# Define path to all processed movies: TO BE EDITED BY THE USER
movieDbBaseDir="/media/project_raphael@fileserver/movieSegmentation"

# Define a working directory where to save the analysis: TO BE EDITED BY THE USER
outDataBaseDir="/home/etournay/Documents"
# Set up path to the TissueMiner code
# This command requires that the environment variable TM_HOME is defined in the .bash_profile
scriptsDir=Sys.getenv("TM_HOME")

# Load TissueMiner libraries
source(file.path(scriptsDir, "commons/TMCommons.R"))
source(file.path(scriptsDir, "db/movie_rotation/RotationFunctions.R"))
source(file.path(scriptsDir, "commons/BaseQueryFunctions.R"))
source(file.path(scriptsDir, "commons/TimeFunctions.R"))
source(file.path(scriptsDir, "config/flywing_tm_config.R"))

# Load an R library
library("zoo")

# Set up working directory
mcdir(outDataBaseDir)

## Set general theme for graphs: more specific tuning must be done for each graph
theme_set(theme_bw())
theme_update(panel.grid.major=element_line(linetype= "dotted", color="black", size=0.2),
             panel.border = element_rect(size=0.3,color="black",fill=NA),
             axis.ticks=element_line(size=0.2),
             axis.ticks.length=unit(0.1,"cm"),
             legend.key = element_blank()
)

# Hardwire isotropic deformation color scheme
isotropColors <- c("division"="orange",
                   "extrusion"="turquoise",
                   "cell_area"="green",
                   "sumContrib"="blue",
                   "tissue_area"="darkred")

## hardwire the movie color scheme
movieColors <- c("WT_25deg_111102"="blue",
          "WT_25deg_111103"="darkgreen",
          "WT_25deg_120531"="red"
)
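
The scriptsDir path defined above is empty when TM_HOME is not visible to R, in which case the source() calls fail. A minimal check can be sketched in base R; the messages are illustrative and are not produced by TissueMiner itself:

```r
# Sys.getenv() returns an empty string when the variable is not defined
tmHome <- Sys.getenv("TM_HOME")
tmHomeDefined <- nzchar(tmHome)

if (!tmHomeDefined) {
  message("TM_HOME is not defined: check your .bash_profile")
} else if (!dir.exists(tmHome)) {
  message("TM_HOME points to a missing folder: ", tmHome)
} else {
  message("TissueMiner code expected in ", tmHome)
}
```

Running this check before sourcing the TissueMiner files gives a clearer message than a failed source() call.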

1.4 R: the basics

Many books and websites describe the R language, and we only introduce the knowledge necessary to understand this tutorial. We recommend a few references that have been useful to us:

  • The art of R programming
  • The R cookbook
  • The R graphics cookbook
  • And some websites…

1.4.1 Variable assignment and simple instructions

# assign a number to the variables x and y
x <- 2
y <- 3
# display the result of x + y
x + y
## [1] 5
# is x equal to y?
x==y
## [1] FALSE
# is x different from y?
x!=y
## [1] TRUE
# is x greater than y?
x>y
## [1] FALSE
# is x less than y?
x<y
## [1] TRUE

1.4.2 A vector is a series of values.

# assign a vector to x and to y:
x <- c(4,3,2)
y <- c(1,2,3)
# assign a boolean vector to z:
z <- c(TRUE,FALSE,TRUE)
# display the result of x + y (element-wise addition):
x + y
## [1] 5 5 5
# display the result of x + y + z (z is automatically coerced to numbers: TRUE becomes 1, FALSE becomes 0)
x + y + z
## [1] 6 5 6

1.4.3 Named vectors

In some cases, it is convenient to name each element of the vector. Such a vector is useful to store configuration parameters.

# assign a named vector to x:
x <- c("movie1"="red", "movie2"="blue", "movie3"="green")
# display the content of x
x
##  movie1  movie2  movie3 
##   "red"  "blue" "green"
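
Elements of a named vector can be retrieved by name, which is how a color scheme such as movieColors can be looked up. A minimal sketch in base R, reusing the vector defined above:

```r
# Named vector used as a small configuration table
x <- c("movie1"="red", "movie2"="blue", "movie3"="green")
# list the names
names(x)
# look up one element by its name
x["movie2"]
# look up several elements, in the requested order
x[c("movie3", "movie1")]
# drop the name and keep only the value
movie2Color <- unname(x["movie2"])
movie2Color
```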

1.4.4 Tabular data: dataframe

Tabular data that we obtain from the relational database are stored in a table referred to as a dataframe in the R language. This tutorial essentially shows how to manipulate dataframes in order to perform calculations and prepare the data for plotting. A dataframe is composed of columns that correspond to vectors of identical length.

# Assign a data frame to x:
x <- data.frame(frame=c(1,2,3), cell_area=c(20,22,24))
# display the content of x:
x
##   frame cell_area
## 1     1        20
## 2     2        22
## 3     3        24
# display the number of rows in x:
nrow(x)
## [1] 3
# display the first 2 rows of x:
head(x, n=2)
##   frame cell_area
## 1     1        20
## 2     2        22
# display the last 2 rows of x:
tail(x, n=2)
##   frame cell_area
## 2     2        22
## 3     3        24
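
Since each column of a dataframe is a vector, columns can be extracted, computed on, and used to subset rows. A short base R sketch on the same toy table (the added column name is arbitrary):

```r
x <- data.frame(frame=c(1,2,3), cell_area=c(20,22,24))
# extract one column as a vector
x$cell_area
# columns are vectors, so arithmetic is element-wise
x$cell_area_doubled <- x$cell_area * 2
# keep only the rows that fulfill a condition
laterFrames <- x[x$frame > 1, ]
laterFrames
# column names and dimensions (rows, columns)
colnames(x)
dim(x)
```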

1.5 How to query a relational database?

1.5.1 Open a connection to the database

  • Database format: SQLite
  • Open a connection to one database: openMovieDb() function using the RSQLite package
  • The name of the time-lapse is used to identify the corresponding database
  • The SQLite connection is assigned to a “db” variable.
# Define path to all time-lapses
movieDbBaseDir <- "/media/project_raphael@fileserver/movieSegmentation"
# Define the path to a particular time-lapse called "WT_25deg_111102"
movieDir <- file.path(movieDbBaseDir, c("WT_25deg_111102"))
# Connection to the DB stored in the "db" variable
db <- openMovieDb(movieDir)
# Close the connection when you are done (re-open it with openMovieDb() before running the queries below)
dbDisconnect(db)

1.5.2 Query the database using the SQL language

  • Simplicity of the SQL language: three keywords (select, from, and where) are sufficient to perform database queries. One selects the desired columns from a given table where the rows fulfill a user-defined criterion.
  • A SQL query results in a table or dataframe that we assign to a variable in the R language
  • More complicated SQL queries are possible, but we will instead use the grammar of data manipulation provided in R to manipulate the dataframes in the computer memory.
# Use the built-in "dbGetQuery" function to query the database "db" using a SQL statement in quotes
# Assign the resulting data frame to the "cellProperties" variable
cellProperties <- dbGetQuery(db, "select cell_id, frame, area from cells")
# show first lines of the table
head(cellProperties)
##   cell_id frame area
## 1   10000     0    0
## 2   10000     1    0
## 3   10000     2    0
## 4   10000     3    0
## 5   10000     4    0
## 6   10000     5    0
# Filter out the margin cell (id 10000) around the tissue
cellProperties <- dbGetQuery(db, "select cell_id, frame, area from cells where cell_id!=10000") 
# Select all columns of a table
allCellProperties <- dbGetQuery(db, "select * from cells where cell_id!=10000") 
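
The select, from, where pattern can be tried without a processed movie. The sketch below builds a small in-memory SQLite database with the DBI and RSQLite packages; the toy "cells" table and its values are made up for illustration and do not come from a TissueMiner database:

```r
library(DBI)

# In-memory SQLite database: a toy stand-in for <movie_name>.sqlite
con <- dbConnect(RSQLite::SQLite(), ":memory:")
toyCells <- data.frame(cell_id=c(10000, 10001, 10002, 10001),
                       frame=c(0, 0, 0, 1),
                       area=c(0, 350, 420, 360))
dbWriteTable(con, "cells", toyCells)

# select two columns from the cells table where the row is not the margin cell
res <- dbGetQuery(con, "select cell_id, area from cells where cell_id!=10000")
res

# close the connection once all queries are done
dbDisconnect(con)
```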

1.6 Manipulate large data sets using a grammar of data manipulation

  • Here, we briefly introduce the main verbs and the syntax of the grammar of data manipulation supplied by the dplyr package. In practice, a single operator and only about five verbs are sufficient to manipulate data effectively. We also encourage the user to download the RStudio cheat sheet here, in which the grammar is summarized.

  • Simply stated, this grammar allows the user to chain a series of operations by using the pipe operator %>%. In each step of the chain, every intermediate result is taken as an input for the next operation. Each type of operation on dataframes is identified by a verb.

  • This grammar also allows the user to chain other built-in R-functions or custom ones.

In the present tutorial, we mainly use the following few verbs:

| Functions | Description | Package |
| --- | --- | --- |
| dbGetQuery | query a SQLite database and return a dataframe | RSQLite |
| mutate | perform calculations on columns by adding new ones or modifying existing ones | dplyr |
| summarize | compute summary statistics | dplyr |
| group_by (and ungroup) | subset data into chunks prior to a mutate or a summarize operation | dplyr |
| filter | select rows based on their content | dplyr |
| select | select columns based on their names | dplyr |
| arrange | order rows by the values of the desired columns | dplyr |
| inner_join | merge two data frames by intersecting user-defined columns | dplyr |
| melt or gather | gather columns into rows | reshape2 or tidyr respectively |
| dcast or spread | spread rows into columns | reshape2 or tidyr respectively |

  • This grammar can be easily extended by the user and can be used in combination with important TissueMiner functions for visualizing and quantifying cell dynamics in 2D-living tissues:

| Functions | Description | Project |
| --- | --- | --- |
| print_head | display the first lines of the current table and the total number of rows in the table without affecting the data content | TissueMiner |
| dt.merge | fast merging of two dataframes with the possibility to suffix columns having identical names | TissueMiner |
| openMovieDb | open a connection to a database of a selected movie | TissueMiner |
| multi_db_query | query multiple databases and aggregate data into a dataframe | TissueMiner |
| coarseGrid | assign grid elements to spatial quantities provided their positions (center_x and center_y) are present | TissueMiner |
| smooth_tissue | average quantities in time using a moving window (convolution) applied by grid elements | TissueMiner |
| align_movie_start | align movies at earliest common developmental time | TissueMiner |
| chunk_time_into_intervals | undersample time for local time averaging | TissueMiner |
| synchronize_frames | find closest frame to user-defined time intervals | TissueMiner |
| mqf_* functions | set of multi-query functions to quantify cell dynamics | TissueMiner |

1.6.1 Learning the grammar on an example

Aim: calculate the average cell area in square microns as a function of time in hours from start of time-lapse recording.

Howto:

  • use the dbGetQuery() function to input a dataframe to start the chain of operations
  • use the %>% operator to chain operations
  • manipulate the input dataframe using the dplyr grammar
# We query the DB to get cell area and pipe the resulting table to the next function
avgCellArea <- dbGetQuery(db, "select cell_id, frame, area from cells") %>%
  # remove the huge artificial margin cell around the tissue
  filter(cell_id!=10000) %>%
  # convert pixel area to square microns, knowing that 1 px = 0.207 micron
  mutate(area_real=(0.207)^2*area) %>%
  # indicate that the next function must be applied frame-wise 
  group_by(frame) %>%
  # calculate the average area in each frame of the time-lapse
  summarize(area_avg=mean(area_real)) %>%
  # cancel grouping
  ungroup() %>%
  # bring time in seconds into the current table by matching the frame number
  inner_join(dbGetQuery(db, "select * from frames"), by="frame") %>%
  # convert time to hours
  mutate(time_h=round(time_sec/3600, 1)) %>%
  # remove the unnecessary columns
  select(-c(frame, time_sec)) %>%
  # order time chronologically
  arrange(time_h)
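
The same chain of verbs can be tried without a database connection. In the sketch below, two small made-up dataframes stand in for the "cells" and "frames" tables, so the numbers are purely illustrative:

```r
library(dplyr)

# Toy stand-ins for the "cells" and "frames" tables (values are made up)
toyCells <- data.frame(cell_id=c(10000, 1, 2, 1, 2),
                       frame=c(0, 0, 0, 1, 1),
                       area=c(0, 100, 200, 300, 500))
toyFrames <- data.frame(frame=c(0, 1), time_sec=c(0, 3600))

toyAvg <- toyCells %>%
  # remove the margin cell
  filter(cell_id!=10000) %>%
  # convert pixel area to square microns (1 px = 0.207 micron)
  mutate(area_real=(0.207)^2*area) %>%
  # average frame-wise
  group_by(frame) %>%
  summarize(area_avg=mean(area_real)) %>%
  ungroup() %>%
  # bring in the time of each frame and convert it to hours
  inner_join(toyFrames, by="frame") %>%
  mutate(time_h=round(time_sec/3600, 1)) %>%
  select(-c(frame, time_sec)) %>%
  arrange(time_h)
toyAvg
```

With two frames in the toy data, the result has two rows, one average area per time point.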

1.6.2 Extending the grammar

For convenience, we built a custom print_head() function to display the first lines of the current dataframe, without affecting the data content. The print_head() function can therefore be placed wherever needed in the chain of operations:

# Here is example 1 again, but we display the first and last steps using print_head()
avgCellArea <- dbGetQuery(db, "select cell_id, frame, area from cells") %>% print_head() %>%
  filter(cell_id!=10000) %>% 
  mutate(area_real=(0.207)^2*area) %>% 
  group_by(frame) %>% 
  summarize(area_avg=mean(area_real)) %>% 
  ungroup() %>% 
  inner_join(dbGetQuery(db, "select * from frames"), by="frame") %>% 
  mutate(time_h=round(time_sec/3600, 1)) %>% 
  select(-c(frame, time_sec)) %>% 
  arrange(time_h) %>% print_head()
##   cell_id frame area
## 1   10000     0    0
## 2   10000     1    0
## 3   10000     2    0
## 4   10000     3    0
## 5   10000     4    0
## 6   10000     5    0
## [1] 3237249
## Source: local data frame [6 x 2]
## 
##   area_avg time_h
##      (dbl)  (dbl)
## 1 25.66822    0.0
## 2 25.36203    0.1
## 3 25.06019    0.2
## 4 24.72758    0.2
## 5 24.41878    0.3
## 6 24.13988    0.4
## [1] 201

1.6.3 Vectorized conditional statement (ifelse)

The R language provides a vectorized ifelse() function that we can then use in combination with the dplyr grammar. The vectorized ifelse() function takes 3 arguments corresponding to the condition (if), the consequent (then), and the alternative (else).

# Here is an example in which we display the resulting table
cell <- dbGetQuery(db, "select cell_id, frame, area, elong_xx, elong_xy from cells") %>% 
  # additional column isMarginCell to flag the margin cell as "true"
  mutate(isMarginCell=ifelse(cell_id==10000, TRUE, FALSE)) %>% print_head()
##   cell_id frame area elong_xx elong_xy isMarginCell
## 1   10000     0    0        0        0         TRUE
## 2   10000     1    0        0        0         TRUE
## 3   10000     2    0        0        0         TRUE
## 4   10000     3    0        0        0         TRUE
## 5   10000     4    0        0        0         TRUE
## 6   10000     5    0        0        0         TRUE
## [1] 3237249
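
Because ifelse() tests every element of a vector at once, no explicit loop is needed. A minimal base R sketch on made-up area values (the thresholds and labels are arbitrary):

```r
area <- c(0, 12, 30, 55)
# one test per element: the result has the same length as the condition
sizeClass <- ifelse(area > 25, "large", "small")
sizeClass
# calls can be nested to handle more than two cases
sizeClass3 <- ifelse(area == 0, "margin", ifelse(area > 25, "large", "small"))
sizeClass3
```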

1.6.4 Modify table layout into wide or long formats

1.6.4.1 Wide to long format: the melt() or gather() function.

The melt() (or gather()) function creates two columns:

  • one ‘variable’ column listing variable names
  • one ‘value’ column with their corresponding value.

melt() and gather() are equivalent, gather() being the newer implementation from the tidyr package.

# Example 1: 
# by default, melt() only gathers numerical data into a pair of {variable, value} columns
longFormat <- melt(cell) %>% print_head()
##   isMarginCell variable value
## 1         TRUE  cell_id 10000
## 2         TRUE  cell_id 10000
## 3         TRUE  cell_id 10000
## 4         TRUE  cell_id 10000
## 5         TRUE  cell_id 10000
## 6         TRUE  cell_id 10000
## [1] 16186245
# by default, gather() gathers all columns
longFormat <- gather(cell) %>% print_head()
##       key value
## 1 cell_id 10000
## 2 cell_id 10000
## 3 cell_id 10000
## 4 cell_id 10000
## 5 cell_id 10000
## 6 cell_id 10000
## [1] 19423494
# Of note, the two columns {cell_id, frame} uniquely define each cell in a frame with its associated properties
# Therefore, to keep the data consistent, the cell_id and frame columns should not be gathered

# Example 2: specify which columns to gather into {variable, value} columns
longFormat <- melt(cell, measure.vars = c("area","elong_xx","elong_xy","isMarginCell")) %>% print_head()
##   cell_id frame variable value
## 1   10000     0     area     0
## 2   10000     1     area     0
## 3   10000     2     area     0
## 4   10000     3     area     0
## 5   10000     4     area     0
## 6   10000     5     area     0
## [1] 12948996
# Or
longFormat <- gather(cell, variable, value, c(area,elong_xx,elong_xy,isMarginCell)) %>% print_head()
##   cell_id frame variable value
## 1   10000     0     area     0
## 2   10000     1     area     0
## 3   10000     2     area     0
## 4   10000     3     area     0
## 5   10000     4     area     0
## 6   10000     5     area     0
## [1] 12948996
# Example 3: specify which columns shouldn't be gathered (equivalent to example 2)
longFormat <- melt(cell, id.vars =  c("cell_id","frame")) %>% print_head()
##   cell_id frame variable value
## 1   10000     0     area     0
## 2   10000     1     area     0
## 3   10000     2     area     0
## 4   10000     3     area     0
## 5   10000     4     area     0
## 6   10000     5     area     0
## [1] 12948996
# Or
longFormat <- gather(cell, variable, value, -c(cell_id,frame)) %>% print_head()
##   cell_id frame variable value
## 1   10000     0     area     0
## 2   10000     1     area     0
## 3   10000     2     area     0
## 4   10000     3     area     0
## 5   10000     4     area     0
## 6   10000     5     area     0
## [1] 12948996

1.6.4.2 Long to wide format: the dcast() or spread() function

The dcast() (or spread()) function creates as many columns as variable names contained in the ‘variable’ column and lists the corresponding values. Both dcast() and spread() are equivalent, spread() being the newest implementation from the tidyr package.

# The melt operation is reversible (the row identifiers must be uniquely defined), but booleans are coerced into numeric format
# Using dcast(), cell_id and frame are the row identifiers, whereas the variable column is spread into column names
example <- cell %>% print_head() %>%
  melt(id.vars =  c("cell_id","frame")) %>% print_head() %>%
  dcast(cell_id+frame~variable, value.var="value") %>% print_head()
##   cell_id frame area elong_xx elong_xy isMarginCell
## 1   10000     0    0        0        0         TRUE
## 2   10000     1    0        0        0         TRUE
## 3   10000     2    0        0        0         TRUE
## 4   10000     3    0        0        0         TRUE
## 5   10000     4    0        0        0         TRUE
## 6   10000     5    0        0        0         TRUE
## [1] 3237249
##   cell_id frame variable value
## 1   10000     0     area     0
## 2   10000     1     area     0
## 3   10000     2     area     0
## 4   10000     3     area     0
## 5   10000     4     area     0
## 6   10000     5     area     0
## [1] 12948996
##   cell_id frame area elong_xx elong_xy isMarginCell
## 1   10000     0    0        0        0            1
## 2   10000     1    0        0        0            1
## 3   10000     2    0        0        0            1
## 4   10000     3    0        0        0            1
## 5   10000     4    0        0        0            1
## 6   10000     5    0        0        0            1
## [1] 3237249
# Or
example <- cell %>% print_head() %>%
  gather(variable, value, -c(cell_id,frame)) %>% print_head() %>%
  spread(variable,value) %>% print_head()
##   cell_id frame area elong_xx elong_xy isMarginCell
## 1   10000     0    0        0        0         TRUE
## 2   10000     1    0        0        0         TRUE
## 3   10000     2    0        0        0         TRUE
## 4   10000     3    0        0        0         TRUE
## 5   10000     4    0        0        0         TRUE
## 6   10000     5    0        0        0         TRUE
## [1] 3237249
##   cell_id frame variable value
## 1   10000     0     area     0
## 2   10000     1     area     0
## 3   10000     2     area     0
## 4   10000     3     area     0
## 5   10000     4     area     0
## 6   10000     5     area     0
## [1] 12948996
##   cell_id frame area elong_xx elong_xy isMarginCell
## 1   10000     0    0        0        0            1
## 2   10000     1    0        0        0            1
## 3   10000     2    0        0        0            1
## 4   10000     3    0        0        0            1
## 5   10000     4    0        0        0            1
## 6   10000     5    0        0        0            1
## [1] 3237249
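
The round trip between wide and long formats is easier to follow on a table small enough to read in full. The sketch below uses gather() and spread() from tidyr on a made-up two-cell table:

```r
library(tidyr)

# Toy wide table: one row per {cell_id, frame}, one column per property
toyCell <- data.frame(cell_id=c(1, 2), frame=c(0, 0),
                      area=c(100, 200), elong_xx=c(0.1, 0.3))

# wide to long: cell_id and frame stay as row identifiers
toyLong <- gather(toyCell, variable, value, -c(cell_id, frame))
toyLong

# long to wide: one column per name found in the 'variable' column
toyWide <- spread(toyLong, variable, value)
toyWide
```

Here the two property columns of two cells become four rows in the long table, and spreading restores the original two rows. Recent tidyr releases also provide pivot_longer() and pivot_wider() for the same operations.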

1.7 Visualize complex data sets using a grammar of graphics

  • Here, we briefly introduce the main verbs and the syntax of the grammar of data visualization supplied by the ggplot2 package. In practice, a single operator and a few visual marks are sufficient to plot data effectively. We also encourage the user to download the corresponding RStudio cheat sheet here regarding data visualization with ggplot2.

  • Simply stated, this grammar allows the user to chain multiple graphical layers to construct a graph by using the plus operator +, thereby improving the clarity of the code for complex graphs.

Some geometrical layers (common types of graphs):

| Function | Description | Package or project |
| --- | --- | --- |
| ggplot | map data to graph elements (axes, colors, etc…) | ggplot2 |
| geom_point | plot data as points | ggplot2 |
| geom_line | join the points by lines | ggplot2 |
| geom_segment | plot a segment such as the representation of a nematic tensor or a cell bond | ggplot2 |
| geom_polygon | plot a polygon such as the representation of a cell | ggplot2 |
| render_frame | plot data onto one movie image | TissueMiner |
| render_movie | plot data onto every movie image and make a movie | TissueMiner |

Some complementary scaling layers:

| Function | Description | Package or project |
| --- | --- | --- |
| scale_x_continuous or scale_y_continuous | to control the x and y axes rendering | ggplot2 |
| scale_color_gradientn | to use a gradient of colors when rendering the data | ggplot2 |

Saving a graph in the desired format (raster or vector graphics)

| Function | Description | Package or project |
| --- | --- | --- |
| ggsave2 | wrapper around the default ggsave() function that adds more functionality; we use ggsave2() in the rest of this tutorial | TissueMiner (using ggplot2) |

Example:

Aim: plot the average cell area in square microns as a function of time in hours from start of time-lapse recording:

  • use the ggplot() function
  • ggplot’s first argument is the dataframe containing the data to be plotted
  • ggplot’s aes() function: to map the data to the system of coordinates
# Show the first rows of the previously calculated avgCellArea data frame:
head(avgCellArea)
## Source: local data frame [6 x 2]
## 
##   area_avg time_h
##      (dbl)  (dbl)
## 1 25.66822    0.0
## 2 25.36203    0.1
## 3 25.06019    0.2
## 4 24.72758    0.2
## 5 24.41878    0.3
## 6 24.13988    0.4
# Map the data to the system of coordinates using ggplot
ggplot(avgCellArea, aes(x = time_h, y = area_avg)) +
  # plot the average area as a line using geom_line
  geom_line() +
  # add a title
  ggtitle("Average cell area as function of time")
  • save the graph with ggsave2()
# Save the plot as svg for editing in Inkscape
ggsave2(width=14, unit="in", outputFormat="svg")
## [1] "Average cell area as function of time.svg"
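
Layers can be stacked on a plot object and saved without TissueMiner loaded. The sketch below uses made-up values and the standard ggsave() function from ggplot2, which ggsave2() wraps; the file name and the temporary output folder are arbitrary choices:

```r
library(ggplot2)

# Toy data (made-up values) with the same columns as avgCellArea
toyAvg <- data.frame(time_h=c(0, 1, 2, 3), area_avg=c(25.7, 24.1, 22.9, 22.0))

# Each "+" adds one layer or one specification to the graph
p <- ggplot(toyAvg, aes(x = time_h, y = area_avg)) +
  geom_line() +
  geom_point(color="red") +
  xlab("Time [h]") +
  ylab("Average cell area") +
  ggtitle("Toy example")

# Typing p at the console draws the graph; ggsave() writes it to a file
outFile <- file.path(tempdir(), "toy_example.pdf")
ggsave(outFile, plot=p, width=5, height=4)
```

Storing the plot in a variable makes it possible to add further layers later, for instance p + theme_bw().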

2 Apply the R-grammar to visualize cells

# Load data into the 'cellshapes' variable
cellshapes <- locload(file.path(movieDir, "cellshapes.RData")) %>% print_head()
## Source: local data frame [6 x 5]
## 
##   frame cell_id    x_pos    y_pos bond_order
##   (int)   (int)    (dbl)    (dbl)      (dbl)
## 1     0   10001 133.1737 1555.900          1
## 2     0   10001 137.7239 1564.635          2
## 3     0   10001 153.0050 1568.695          3
## 4     0   10001 154.8771 1566.575          4
## 5     0   10001 154.2572 1556.595          5
## 6     0   10001 145.5287 1545.114          6
## [1] 19250452

2.1 Example 1: plot cells as polygons in the Cartesian system

ggplot(cellshapes %>% filter(frame==70)) +
  # plot cells as polygons:
  geom_polygon(aes(x_pos, y_pos, group=cell_id),color="green",fill="white", size=0.3) +
  # X and Y axes must have the same scale:
  coord_equal() +
  # add a title:
  ggtitle("Pupal wing cells represented as polygons in Cartesian system")

2.2 Example 2: plot cells as polygons in the image coordinate system

ggplot(cellshapes %>% filter(frame==70)) +
  geom_polygon(aes(x_pos, y_pos, group=cell_id),color="green",fill="white", size=0.3) + 
  coord_equal() +
  # In an image coordinate system, the Y-axis is pointing downwards. We flip the Y-axis:
  scale_y_continuous(trans = "reverse") +
  ggtitle("Pupal wing cells represented as polygons in image coordinate system")

2.3 Example 3: plot cells and vertices

ggplot(cellshapes %>% filter(frame==70)) +
  geom_polygon(aes(x_pos, y_pos, group=cell_id),color="green",fill="white", size=0.3) +
  # plot each vertex as a point:
  geom_point(aes(x_pos, y_pos),color="red", size=0.4) + 
  coord_equal() +
  scale_y_continuous(trans = "reverse") +
  ggtitle("Pupal wing cells and vertices")

2.4 Example 4: overlay cells and vertices on the image

We can now overlay cells and vertices on the wing image. To do so, we built a dedicated render_frame() function that loads the specified frame of the time-lapse. This function takes the cell contour table and a desired frame as input variables. The render_frame() function returns the first layers of the graph, which consist of a raster image of the wing and additional specifications such as the Y-axis flipping, scale_y_continuous(trans = "reverse"), and the iso-scaling of the X and Y axes, coord_equal().

# Plot cells and vertices on the original image
cellshapes %>%
  # add overlay image (! connection to DB required !):
  render_frame(70) +
  geom_polygon(aes(x_pos, y_pos, group=cell_id), color="green",fill=NA, size=0.2) + 
  geom_point(aes(x_pos, y_pos),color="red", size=0.4) +
  ggtitle("Cells and vertices overlaid on the image")
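
The idea of a function that returns the first layers of a graph, to which further layers are then added with +, can be sketched as follows. toy_render_frame() is a hypothetical stand-in written for this illustration: it is not the TissueMiner implementation and it does not load any image.

```r
library(ggplot2)

# Hypothetical stand-in for render_frame(): filters one frame and returns
# the base specifications (iso-scaled axes, flipped Y-axis), without image
toy_render_frame <- function(shapes, frameNb) {
  ggplot(shapes[shapes$frame == frameNb, ]) +
    coord_equal() +
    scale_y_continuous(trans = "reverse")
}

# One square cell in frame 0 (made-up coordinates)
toyShapes <- data.frame(frame=0, cell_id=1,
                        x_pos=c(0, 1, 1, 0), y_pos=c(0, 0, 1, 1),
                        bond_order=1:4)

# Additional layers are chained to the returned object
p <- toy_render_frame(toyShapes, 0) +
  geom_polygon(aes(x_pos, y_pos, group=cell_id), color="green", fill=NA) +
  geom_point(aes(x_pos, y_pos), color="red")
```

Because the data are attached in the ggplot() call inside the function, the layers added afterwards only need to declare their aesthetics.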

2.5 Further reading: the render_frame() function

Please, read the current definition of the render_frame() function at the following location

3 Working with regions of interest (ROIs)

The automated workflow includes routines to browse the cell lineage and to follow ROIs in time once defined on a given image of the time-lapse. Please note that cells in contact with the margin are discarded because the segmentation and tracking quality is not optimal near the margin.

Example: Visualize cells in selected ROIs

# Load tracked ROIs: 
lgRoiSmoothed <- locload(file.path(movieDir, "roi_bt/lgRoiSmoothed.RData")) %>% print_head() %>%
  filter(roi %in% c("hinge", "blade", "HBinterface")) 
##     roi cell_id
## 1 blade   24475
## 2 blade   10009
## 3 blade   24476
## 4 blade   10103
## 5 blade   41039
## 6 blade   41040
## [1] 76774
# Load cell shapes for plotting on the wing
cellshapes <- locload(file.path(movieDir, "cellshapes.RData"))

# Merge ROI with cell polygonal definition
cellshapesWithRoi <- dt.merge(cellshapes, lgRoiSmoothed, by="cell_id", allow.cartesian=TRUE) %>%
  arrange(frame, cell_id, bond_order) ## .. because merge messed up the ordering

# Plot  ROI on the wing
render_frame(cellshapesWithRoi, 200) + 
  geom_polygon(aes(x_pos, y_pos, fill=roi, group=cell_id), alpha=0.5) +
  scale_fill_manual(values=c("blade"="darkgreen",
                             "hinge"="yellow",
                             "HBinterface"="red"))

4 Make videos

Videos are helpful to visualize the time evolution of patterns.

Here, we use a parallelized loop over all frames of the time-lapse. The avconv program (from the Libav project, a fork of FFmpeg) must be installed on your computer to create videos: Mac users can visit http://ffmpegmac.net/, and Ubuntu-Linux users can simply run "sudo apt-get install libav-tools".

# Make a video of the ROIs on the wing
render_movie(cellshapesWithRoi, "bt_bhfix_peeled.mp4", list(
          geom_polygon(aes(x_pos, y_pos, fill=roi, group=cell_id),  alpha=0.5),
          scale_fill_manual(values=c("blade"="darkgreen",
                             "hinge"="yellow",
                             "HBinterface"="red"))))

5 A TissueMiner library to visualize cell dynamics

TissueMiner provides a set of tools to quantify and visualize cell dynamics at different spatial scales. These tools are all prefixed with ‘mqf’ as they perform multiple queries to the pre-processed data obtained with the TissueMiner automated workflow. Their common features are:

Mandatory argument: a path to a selected movie directory

Optional arguments: to control selected ROIs and other parameters that are specific to some subsets of mqf functions

5.1 Fine-grained analyses

  • mqf_fg_* functions generate fine-grained data (cellular scale) that
    • are mapped to all ROIs by default
    • include bond, cell or triangle contours for plotting
    • perform an automatic scaling of nematics for an optimal display on the original image

| mqf_fg_* functions | Description | Source data |
| --- | --- | --- |
| mqf_fg_nematics_cell_elong | get cell elongation nematics from the DB | DB, cellshapes.RData |
| mqf_fg_unitary_nematics_CD | calculate cell division unitary nematics | DB, cellshapes.RData |
| mqf_fg_unitary_nematics_T1 | calculate cell neighbor change unitary nematics | DB, t1DataFilt.RData, topoChangeSummary.RData, cellshapes.RData |
| mqf_fg_cell_area | get cell area from the DB | DB, cellshapes.RData |
| mqf_fg_triangle_properties | get calculated triangle state properties | DB, Ta_t.RData, triList.RData, triangles.RData |
| mqf_fg_bond_length | get bond length and positions from the DB | DB |
| mqf_fg_cell_neighbor_count | calculate cell neighbor number from the DB | DB, cellshapes.RData |
| mqf_fg_dev_time | get developmental time from the configuration file | DB |

5.2 Coarse-grained analyses: per frame and by ROIs

  • The mqf_cg_roi_* functions:
    • perform spatial (by ROI) and temporal (kernSize option) averaging of quantities
    • perform an automatic scaling of nematics for an optimal display on the original image

| mqf_cg_roi_* functions | Description | Source data |
| --- | --- | --- |
| mqf_cg_roi_cell_count | count cell number | DB |
| mqf_cg_roi_cell_area | coarse-grain cell area | DB |
| mqf_cg_roi_cell_neighbor_counts | average cell neighbor count | DB |
| mqf_cg_roi_polygon_class | average and trim cell polygon class | DB |
| mqf_cg_roi_triangle_elong | coarse-grain cell elongation using triangles as a proxy | DB |
| mqf_cg_roi_rate_CD | average cell division rate | DB |
| mqf_cg_roi_rate_T2 | average extrusion rate | DB |
| mqf_cg_roi_rate_T1 | average neighbor change rate | DB |
| mqf_cg_roi_rate_isotropic_contrib | coarse-grain isotropic tissue deformation and its cellular contributions | DB |
| mqf_cg_roi_rate_shear | coarse-grain anisotropic tissue deformation and its cellular contributions | DB |
| mqf_cg_roi_nematics_cell_elong | coarse-grain cell elongation nematics by ROI | see mqf_fg_nematics_cell_elong() |
| mqf_cg_roi_unitary_nematics_CD | coarse-grain division unitary nematics | DB |
| mqf_cg_roi_unitary_nematics_T1 | coarse-grain neighbor change unitary nematics | DB |

5.3 Coarse-grained analyses: per frame and by square-grid elements

  • The mqf_cg_grid_* functions:
    • perform spatial (by grid element) and temporal (kernSize option) averaging of quantities
    • perform an automatic scaling of nematics for an optimal display on the original image

| mqf_cg_grid_* functions | Description | Source data |
| --- | --- | --- |
| mqf_cg_grid_nematics_cell_elong | coarse-grain cell elongation nematics | DB |
| mqf_cg_grid_unitary_nematics_CD | coarse-grain division unitary nematics | DB |
| mqf_cg_grid_unitary_nematics_T1 | coarse-grain neighbor change unitary nematics | DB |

6 Render bond length pattern

To render cell bonds, one must get the bonds and their respective positions. This requires joining several related tables to pool the relevant data to be plotted.

For the sake of clarity, we built a dedicated mqf_fg_bond_length() function to get bond properties along with bond positions for plotting. Please read its definition here.


# we use the movieDir variable defined above
bond_with_vx <- mqf_fg_bond_length(movieDir, "blade") %>% print_head()
##             movie frame cell_id bond_id vertex_id.2 vertex_id.1  x_pos.1   y_pos.1  x_pos.2   y_pos.2 bond_length   roi time_sec timeInt_sec
## 1 WT_25deg_111102     0   10002   25707       17163       17164 3260.795  339.6792 3265.463  366.4413     28.2426 blade        0         287
## 2 WT_25deg_111102     0   10002   25853       17164       17261 3288.239  345.9900 3260.795  339.6792     31.1421 blade        0         287
## 3 WT_25deg_111102     0   10002   25934       17261       17314 3295.405  364.5816 3288.239  345.9900     21.4853 blade        0         287
## 4 WT_25deg_111102     0   10004   22324       14781       14782 2712.391  914.7809 2700.848  922.5113     14.8995 blade        0         287
## 5 WT_25deg_111102     0   10004   22536       14782       14921 2724.052  925.0779 2712.391  914.7809     15.5563 blade        0         287
## 6 WT_25deg_111102     0   10005   21091       13968       13969 2525.682 1102.7165 2534.659 1118.1898     19.3137 blade        0         287
##   time_shift dev_time
## 1      54000       15
## 2      54000       15
## 3      54000       15
## 4      54000       15
## 5      54000       15
## 6      54000       15
## [1] 4202428

bond_with_vx %>%
  render_frame(70) + 
  # bonds are represented by segments using geom_segment
  geom_segment(aes(x=x_pos.1, y=y_pos.1,
                   xend=x_pos.2, yend=y_pos.2,
                   color=bond_length), # Here bond_length values are mapped to the color
               size=0.3, lineend="round") +
  # we override the default color map with a custom rainbow palette
  scale_color_gradientn(name="bond_length",
                        colours=c("black", "blue", "green", "yellow", "red"),
                        limits=c(0,quantile(bond_with_vx$bond_length, probs = 99.9/100)),
                        na.value = "red") +
  ggtitle("Color-coded pattern of bond length")
# Here we use the render_movie function with the bond_with_vx table loaded above
render_movie(bond_with_vx, "BondLengthPattern.mp4", list(
  geom_segment(aes(x=x_pos.1, y=y_pos.1,xend=x_pos.2, yend=y_pos.2,color=bond_length), 
               size=0.3, lineend="round") ,
  scale_color_gradientn(name="bond_length",
                        colours=c("black", "blue", "green", "yellow", "red"),
                        limits=c(0,quantile(bond_with_vx$bond_length, probs = 99.9/100)),
                        na.value = "red") # outliers are red
))
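
The upper limit of the color scale above is a quantile of the bond lengths, so that a few very long bonds do not compress the color range. In ggplot2, values outside the limits of a continuous scale are treated as missing by default and drawn with na.value. The base R sketch below illustrates the quantile step on made-up lengths, with a 90% threshold instead of 99.9% because the toy vector is short:

```r
# Made-up bond lengths with one extreme value
bondLength <- c(10, 12, 15, 18, 20, 22, 25, 30, 35, 400)

# Upper limit below which 90% of the values fall
upper <- quantile(bondLength, probs = 90/100)
upper

# Values above the limit: these are the ones that a color scale with
# limits=c(0, upper) would draw with the na.value color
outliers <- bondLength[bondLength > upper]
outliers
```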